Typical address-oriented computer memories cannot recognize incomplete or noisy information. Associative (content-addressable) memories solve this problem but suffer from severe capacity shortages. I propose a model of a quantum memory that solves both problems. The storage capacity is exponential in the number of qbits and thus optimal. The retrieval mechanism for incomplete or noisy inputs is probabilistic, with postselection of the measurement result. The output is determined by a probability distribution on the memory which is peaked around the stored patterns closest in Hamming distance to the input.

PACS: 03.67.L 



Quantum computation is normally associated with new complexity classes which are inaccessible (in polynomial time) to classical Turing machines. In other words, quantum algorithms can drastically speed up the solution of tasks with respect to their classical counterparts, the paramount examples being Shor's factoring algorithm and Grover's search algorithm.

There is, however, another aspect of quantum computation which represents a big improvement upon its classical counterpart. In traditional computers the storage of information requires setting up a lookup table (RAM). The main disadvantage of this address-oriented memory system lies in its rigidity. Retrieval of information requires a precise knowledge of the memory address and, therefore, incomplete or noisy inputs are not permitted.

In order to address this shortcoming, models of associative (or content-addressable) memories [5] were introduced. Here, recall of information is possible on the basis of partial knowledge of its content, without knowing the storage location. These are examples of collective computation on neural networks, the best known example being the Hopfield model and its generalization to a bidirectional associative memory.

While these models solve the problem of recalling incomplete or noisy inputs, they suffer from a severe capacity shortage. Due to the phenomenon of crosstalk, which is essentially a manifestation of the spin glass transition in the corresponding spin systems, the maximum number of binary patterns that can be stored in a Hopfield network of $n$ neurons is $p_{\max} \simeq 0.14\, n$. While various possible improvements can be introduced, the maximum number of patterns remains linear in the number of neurons, $p_{\max} = O(n)$.

In this paper I show that quantum mechanical entanglement provides a natural mechanism for both improving dramatically the storage capacity of associative memories and retrieving noisy or incomplete information. Indeed, the number of binary patterns that can be stored in such a quantum memory is exponential in the number $n$ of qbits, $p_{\max} = 2^n$, i.e. it is optimal in the sense that all binary patterns that can be formed with $n$ bits can be stored. The retrieval mechanism is probabilistic, with postselection of the measurement result. This means that one has to repeat the retrieval algorithm until a threshold $T$ is reached or the measurement of a control qbit yields a given result. In the former case the input is not recognized. In the latter case, instead, the output is itself determined by a probability distribution on the memory which is peaked around the stored patterns closest (in Hamming distance) to the input. The efficiency of this information retrieval mechanism depends on the distribution of the stored patterns. Recognition efficiency is best when the number of stored patterns is very large, while identification efficiency is best for isolated patterns which are very different from all other ones, both very intuitive features.

Let me start by describing the elementary quantum gates that I will use in the rest of the paper. First of all there are the single-qbit gates NOT, represented by the first Pauli matrix $\sigma_1$, and H (Hadamard), with the matrix representation

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} . \qquad (1)$$

Then, I will use extensively the two-qbit XOR (exclusive OR) gate, which performs a NOT on the second qbit if and only if the first one is in state $|1\rangle$. In matrix notation this gate is represented as XOR = diag$(1, \sigma_1)$, where 1 denotes a two-dimensional identity matrix and $\sigma_1$ acts on the components $|10\rangle$ and $|11\rangle$ of the Hilbert space. The 2XOR, or Toffoli gate, is the three-qbit generalization of the XOR gate: it performs a NOT on the third qbit if and only if the first two are both in state $|1\rangle$. In matrix notation it is given by 2XOR = diag$(1, 1, 1, \sigma_1)$. In the storage algorithm I shall make use also of the nXOR generalization of these gates, in which there are $n$ control qbits. This gate is also used in the subroutines implementing the oracles underlying Grover's algorithm and can be realized using unitary maps affecting only few qbits at a time, which makes it feasible. All these are standard gates. In addition to them I introduce the two-qbit controlled gates

$$CS^i = |0\rangle\langle 0| \otimes 1 + |1\rangle\langle 1| \otimes S^i , \qquad S^i = \begin{pmatrix} \sqrt{\frac{i-1}{i}} & \frac{1}{\sqrt{i}} \\ -\frac{1}{\sqrt{i}} & \sqrt{\frac{i-1}{i}} \end{pmatrix} , \qquad (2)$$

for $i = 1, \dots, p$. These have the matrix notation $CS^i$ = diag$(1, S^i)$. For all these gates I shall indicate by subscripts the qbits on which they are applied, the control qbits coming always first.
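As a quick numerical check of the controlled gates of eq. (2), the following minimal sketch builds $S^i$ and $CS^i$ as plain nested lists and verifies that they are unitary. It assumes that $S^i$ is the real rotation with diagonal entries $\sqrt{(i-1)/i}$ and off-diagonal entries $\pm 1/\sqrt{i}$ (the sign convention is an assumption), and all function names are illustrative.

```python
import math

def s_gate(i):
    """Return S^i as a 2x2 nested list (sign convention is an assumption)."""
    a = math.sqrt((i - 1) / i)
    b = 1.0 / math.sqrt(i)
    return [[a, b], [-b, a]]

def cs_gate(i):
    """Return CS^i = diag(1, S^i) in the basis |00>, |01>, |10>, |11>."""
    s = s_gate(i)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, s[0][0], s[0][1]],
        [0.0, 0.0, s[1][0], s[1][1]],
    ]

def matmul(x, y):
    """Multiply two matrices given as nested lists."""
    return [[sum(x[r][k] * y[k][c] for k in range(len(y)))
             for c in range(len(y[0]))] for r in range(len(x))]

def transpose(x):
    """Transpose a real matrix (equal to its adjoint here)."""
    return [list(col) for col in zip(*x)]

def is_identity(x, tol=1e-12):
    """Check that a square matrix equals the identity within tol."""
    return all(abs(x[r][c] - (1.0 if r == c else 0.0)) < tol
               for r in range(len(x)) for c in range(len(x)))

def apply(gate, state):
    """Apply a matrix to a state vector given as a list of amplitudes."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]
```

With this convention, `apply(s_gate(4), [0.0, 1.0])` sends $|1\rangle$ to amplitudes $1/\sqrt{4}$ on $|0\rangle$ and $\sqrt{3/4}$ on $|1\rangle$, which is the splitting the storage algorithm relies on.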

Given $p$ binary patterns $p^i$ of length $n$, it is not difficult to imagine how a quantum memory can store them. Indeed, such a memory is naturally provided by the following superposition of $n$ entangled qbits:

$$|M\rangle = \frac{1}{\sqrt{p}} \sum_{i=1}^{p} |p^i\rangle . \qquad (3)$$

The only real question is how to generate this state unitarily from a simple initial state of $n$ qbits. To this end one can use the algorithm proposed in [10]. Here, however, I shall propose a simplified version.

In constructing $|M\rangle$ I shall use three registers: a first register $p$ of $n$ qbits in which I will subsequently feed the patterns $p^i$ to be stored, a utility register $u$ of two qbits prepared in state $|01\rangle$, and another register $m$ of $n$ qbits to hold the memory. This latter will be initially prepared in state $|0_1, \dots, 0_n\rangle$. The full initial quantum state is thus

$$|\psi_0^1\rangle = |p^1_1, \dots, p^1_n ; 01 ; 0_1, \dots, 0_n\rangle . \qquad (4)$$

The idea of the storage algorithm is to separate this state into two terms, one corresponding to the already stored patterns, and another ready to process a new pattern. These two parts will be distinguished by the state of the second utility qbit $u_2$: $|0\rangle$ for the stored patterns and $|1\rangle$ for the processing term.

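For each pattern $p^i$ to be stored one has to perform the operations described below:

$$|\psi_1^i\rangle = \prod_{j=1}^{n} 2\mathrm{XOR}_{p^i_j u_2 m_j} |\psi_0^i\rangle . \qquad (5)$$

This simply copies pattern $p^i$ into the memory register of the processing term, identified by $|u_2\rangle = |1\rangle$.

$$|\psi_2^i\rangle = \prod_{j=1}^{n} \mathrm{NOT}_{m_j}\, \mathrm{XOR}_{p^i_j m_j} |\psi_1^i\rangle , \qquad |\psi_3^i\rangle = n\mathrm{XOR}_{m_1 \dots m_n u_1} |\psi_2^i\rangle . \qquad (6)$$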

The first of these operations makes all qbits of the memory register $|1\rangle$'s when the contents of the pattern and memory registers are identical, which is exactly the case only for the processing term. Together, these two operations change the first utility qbit $u_1$ of the processing term to a $|1\rangle$, leaving it unchanged for the stored patterns term.

$$|\psi_4^i\rangle = CS^{p+1-i}_{u_1 u_2} |\psi_3^i\rangle . \qquad (7)$$

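This is the central operation of the storing algorithm. It separates out the new pattern to be stored, already with the correct normalization factor.

$$|\psi_5^i\rangle = n\mathrm{XOR}_{m_1 \dots m_n u_1} |\psi_4^i\rangle , \qquad |\psi_6^i\rangle = \prod_{j=n}^{1} \mathrm{XOR}_{p^i_j m_j}\, \mathrm{NOT}_{m_j} |\psi_5^i\rangle . \qquad (8)$$

These two operations are the inverse of eqs. (6) and restore the utility qbit $u_1$ and the memory register $m$ to their original values. After these operations one has

$$|\psi_6^i\rangle = \frac{1}{\sqrt{p}} \sum_{k=1}^{i} |p^i ; 00 ; p^k\rangle + \sqrt{\frac{p-i}{p}}\, |p^i ; 01 ; p^i\rangle . \qquad (9)$$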

With the last operation,

$$|\psi_7^i\rangle = \prod_{j=n}^{1} 2\mathrm{XOR}_{p^i_j u_2 m_j} |\psi_6^i\rangle , \qquad (10)$$

one restores the third register $m$ of the processing term, the second term in eq. (9) above, to its initial value $|0_1, \dots, 0_n\rangle$. At this point one can load a new pattern into register $p$ and go through the same routine as just described. At the end of the whole process, the $m$-register is exactly in state $|M\rangle$, eq. (3).
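The normalization bookkeeping of the storage loop can be followed classically. The minimal sketch below tracks only the real amplitudes involved: the amplitude split off onto each newly stored pattern and the amplitude left in the processing term. It assumes that round $i$ applies $CS^{p+1-i}$ to a processing term with $|u_2\rangle = |1\rangle$, and that $S^k$ sends $|1\rangle$ to $(1/\sqrt{k})|0\rangle + \sqrt{(k-1)/k}\,|1\rangle$. The name `store_amplitudes` is illustrative and does not appear in the paper.

```python
import math

def store_amplitudes(p):
    """Track amplitudes through p storage rounds.

    Returns (stored, processing): the list of amplitudes assigned to the
    p stored patterns, and the amplitude left in the processing term.
    """
    processing = 1.0  # the initial state consists of the processing term only
    stored = []
    for i in range(1, p + 1):
        k = p + 1 - i  # round i uses the gate CS^{p+1-i}
        # S^k on |u2> = |1>: a fraction 1/sqrt(k) goes to |0> (stored term),
        # a fraction sqrt((k-1)/k) stays on |1> (processing term).
        stored.append(processing / math.sqrt(k))
        processing *= math.sqrt((k - 1) / k)
    return stored, processing
```

For example, `store_amplitudes(5)` returns five stored amplitudes all equal to $1/\sqrt{5}$ and a vanishing processing amplitude, in line with the equal-weight superposition of eq. (3).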

Assume now one is given a binary input $i$, which might be, e.g., a corrupted version of one of the patterns stored in the memory. The first step of the information recall process is to make a copy of the memory $|M\rangle$ to be used in the retrieval algorithm described below. Due to the no-cloning theorem [11], this cannot be done deterministically (i.e. using only unitary operations); a faithful copy of $|M\rangle$ can be obtained only with a probabilistic cloning machine. I shall thus assume the availability of a probabilistic cloning machine for which $|M\rangle$ is one of the set of linearly independent states that can be copied.

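The retrieval algorithm requires also three registers. The first register $i$ of $n$ qbits contains the input pattern; the second register $m$, also of $n$ qbits, contains the memory $|M\rangle$; finally there is a single-qbit control register $c$ initialized to the state $(|0\rangle + |1\rangle)/\sqrt{2}$. The full initial quantum state is thus

$$|\psi_0\rangle = \frac{1}{\sqrt{2p}} \sum_{k=1}^{p} |i_1, \dots, i_n ; p^k_1, \dots, p^k_n ; 0\rangle + \frac{1}{\sqrt{2p}} \sum_{k=1}^{p} |i_1, \dots, i_n ; p^k_1, \dots, p^k_n ; 1\rangle . \qquad (11)$$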

I now apply to it the following combination of quantum gates:

$$|\psi_1\rangle = \prod_{j=1}^{n} \mathrm{NOT}_{m_j}\, \mathrm{XOR}_{i_j m_j} |\psi_0\rangle , \qquad (12)$$

where, as before, the subscripts on the gates refer to the qbits on which they are applied. As a result of this, the memory register qbits are in state $|1\rangle$ if $i_j$ and $p^k_j$ are identical and $|0\rangle$ otherwise:

$$|\psi_1\rangle = \frac{1}{\sqrt{2p}} \sum_{k=1}^{p} |i_1, \dots, i_n ; d^k_1, \dots, d^k_n ; 0\rangle + \frac{1}{\sqrt{2p}} \sum_{k=1}^{p} |i_1, \dots, i_n ; d^k_1, \dots, d^k_n ; 1\rangle , \qquad (13)$$

where $d^k_j = 1$ if and only if $i_j = p^k_j$ and $d^k_j = 0$ otherwise.
Consider now the following Hamiltonian:

$$\mathcal{H} = (d_H)_m \otimes (\sigma_3)_c , \qquad (d_H)_m = \sum_{k=1}^{n} \left( \frac{\sigma_3 + 1}{2} \right)_{m_k} , \qquad (14)$$

where $\sigma_3$ is the third Pauli matrix. $\mathcal{H}$ measures the number of 0's in register $m$, with a plus sign if $c$ is in state $|0\rangle$ and a minus sign if $c$ is in state $|1\rangle$. Given how I have prepared the state $|\psi_1\rangle$, this is nothing else than the number of qbits which are different in the input and memory registers $i$ and $m$. This quantity is called the Hamming distance and represents the (squared) Euclidean distance between two binary patterns.
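The bookkeeping behind eqs. (13) and (14) is easy to reproduce classically. The sketch below, with illustrative function names, computes the match bits $d^k_j$ for one stored pattern and checks that the number of 0's among them is the Hamming distance, which for binary vectors also equals the squared Euclidean distance.

```python
def match_bits(inp, pattern):
    """Bits d_j: 1 where input and pattern agree, 0 where they differ."""
    if len(inp) != len(pattern):
        raise ValueError("input and pattern must have the same length")
    return [1 if a == b else 0 for a, b in zip(inp, pattern)]

def hamming_distance(inp, pattern):
    """Number of positions in which input and pattern differ."""
    return match_bits(inp, pattern).count(0)

def squared_euclidean(inp, pattern):
    """Squared Euclidean distance between two binary vectors."""
    return sum((a - b) ** 2 for a, b in zip(inp, pattern))
```

For instance, the input `[1, 0, 1, 1]` and the pattern `[1, 1, 1, 0]` give match bits `[1, 0, 1, 0]` and distance 2.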

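Every term in the superposition (13) is an eigenstate of $\mathcal{H}$ with a different eigenvalue. Applying thus the unitary operator $\exp(i\pi\mathcal{H}/2n)$ to $|\psi_1\rangle$ one obtains

$$|\psi_2\rangle = \frac{1}{\sqrt{2p}} \sum_{k=1}^{p} e^{i\frac{\pi}{2n} d_H(i, p^k)} |i_1, \dots, i_n ; d^k_1, \dots, d^k_n ; 0\rangle + \frac{1}{\sqrt{2p}} \sum_{k=1}^{p} e^{-i\frac{\pi}{2n} d_H(i, p^k)} |i_1, \dots, i_n ; d^k_1, \dots, d^k_n ; 1\rangle , \qquad (15)$$

where $d_H(i, p^k)$ denotes the Hamming distance between the input $i$ and the stored pattern $p^k$.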

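In the final step I restore the memory register to the state $|M\rangle$ by applying the inverse transformation to eq. (12) and I apply the Hadamard gate (1) to the control qbit, thereby obtaining

$$|\psi_3\rangle = H_c \prod_{j=n}^{1} \mathrm{XOR}_{i_j m_j}\, \mathrm{NOT}_{m_j} |\psi_2\rangle$$

$$= \frac{1}{\sqrt{p}} \sum_{k=1}^{p} \cos\left( \frac{\pi}{2n} d_H(i, p^k) \right) |i_1, \dots, i_n ; p^k_1, \dots, p^k_n ; 0\rangle + \frac{1}{\sqrt{p}} \sum_{k=1}^{p} \sin\left( \frac{\pi}{2n} d_H(i, p^k) \right) |i_1, \dots, i_n ; p^k_1, \dots, p^k_n ; 1\rangle . \qquad (16)$$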

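This concludes the deterministic part of the information retrieval process. At this point one needs a measurement of the control qbit $c$. The probabilities for this to be in states $|0\rangle$ and $|1\rangle$ are given by the expressions

$$P(|c\rangle = |0\rangle) = \sum_{k=1}^{p} \frac{1}{p} \cos^2\left( \frac{\pi}{2n} d_H(i, p^k) \right) , \qquad (17)$$

$$P(|c\rangle = |1\rangle) = \sum_{k=1}^{p} \frac{1}{p} \sin^2\left( \frac{\pi}{2n} d_H(i, p^k) \right) . \qquad (18)$$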
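The control-qbit probabilities can be checked at the amplitude level. The sketch below is a minimal illustration with hypothetical function names: for a single memory term at Hamming distance $d$ it starts from the two phases $e^{\pm i\pi d/2n}$ on the control qbit, applies the Hadamard gate, and then averages the resulting probabilities over the stored patterns. In this calculation the $|1\rangle$ amplitude carries a relative phase factor $i$, which drops out of the probabilities.

```python
import cmath
import math

def control_amplitudes(distance, n):
    """Amplitudes of |c>=|0> and |c>=|1> for one memory term."""
    theta = math.pi * distance / (2 * n)
    a0 = cmath.exp(1j * theta) / math.sqrt(2)   # c = 0 branch after the phase
    a1 = cmath.exp(-1j * theta) / math.sqrt(2)  # c = 1 branch after the phase
    # Hadamard on the control qbit mixes the two branches.
    return (a0 + a1) / math.sqrt(2), (a0 - a1) / math.sqrt(2)

def control_probabilities(distances, n):
    """Return (P(c=0), P(c=1)) for equally weighted stored patterns."""
    p = len(distances)
    p0 = sum(abs(control_amplitudes(d, n)[0]) ** 2 for d in distances) / p
    p1 = sum(abs(control_amplitudes(d, n)[1]) ** 2 for d in distances) / p
    return p0, p1
```

Calling `control_probabilities([0, 3, 8], 8)` gives two numbers that add up to one and agree with the averaged squared cosines and sines of the distances.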



If the input pattern is very different from all stored patterns, one has a high probability of measuring $|c\rangle = |1\rangle$. On the contrary, an input pattern close to all stored patterns leads to a high probability of measuring $|c\rangle = |0\rangle$. One can thus set a threshold $T$; if $T$ repetitions of the retrieval algorithm all lead to a measurement $|c\rangle = |1\rangle$, one classifies the input $i$ as non-recognized. If one gets a measurement $|c\rangle = |0\rangle$ before the threshold is reached, instead, one classifies the input $i$ as recognized and one can proceed to a measurement of the memory register to identify it. This measurement yields pattern $p^k$ with probability

$$P(p^k) = \frac{1}{p\, P(|c\rangle = |0\rangle)} \cos^2\left( \frac{\pi}{2n} d_H(i, p^k) \right) . \qquad (19)$$

This probability is peaked around those patterns which have the smallest Hamming distance to the input. The highest probability of retrieval is thus realized for the pattern (or patterns) most similar to the input.

What about the efficiency of this information retrieval mechanism? Contrary to any classical counterpart, this efficiency depends here on two features: the threshold $T$ determining recognition and the shape of the probability distribution in eq. (19), determining the identification. The threshold $T$ should be optimally chosen according to the probabilities in eqs. (17), (18) and depends thus on the distribution of the stored patterns. Indeed, the probability of recognition is determined by comparing (squared) cosines and sines of the distances to the stored patterns. It is thus clear that the worst case for recognition is the situation in which there is an isolated pattern, with the remaining patterns forming a tight cluster spanning all the largest distances to the first one. Let me suppose that $p = O(n^x)$, $x \ll n$, and assume for simplicity that $p = 1 + \sum_{k=0}^{x} \binom{n}{k}$ and the distribution is such that exactly all patterns of distances $d_H = n, n-1, \dots, n-x$ to one isolated pattern are stored. If one presents exactly this isolated pattern as input, one of the (squared) cosines in eq. (17) is 1, while the rest all take the smallest possible values, giving



$$P(|c\rangle = |0\rangle) > \frac{1}{p} + \frac{\pi^2}{4 n^2} . \qquad (20)$$



In order to have the best recognition efficiency also in this worst case, one should therefore choose the threshold $T = O(n)$ for $x = 1$ and $T = O(n^2)$ for $n \gg x \geq 2$. While this entails a large number of repetitions, it is still polynomial in the number $n$ of qbits and thus tractable. Note also that the required threshold diminishes when the number of stored patterns becomes very large, since, in this case, the distribution of patterns becomes necessarily more homogeneous. Indeed, for the maximal number of stored patterns $p = 2^n$ one has $P(|c\rangle = |0\rangle) = 1/2$ and the recognition efficiency becomes also maximal, as it should be. In the general case one can initially estimate the $p$ recognition probabilities of the patterns by setting $i = p^k$ for $k = 1, \dots, p$ in eq. (17). Letting $P_{\min}$ be the smallest of these, one can once and for all choose the threshold $T$ of this memory as the nearest integer to $1/P_{\min}$. I do not discuss here a possible quantum speed-up of this calculation since the main point of the present paper is the exponential storage capacity with retrieval of noisy inputs.
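The threshold prescription can be tried out classically on small memories. The sketch below, with illustrative function names, evaluates the recognition probability for every stored pattern used as its own input and sets $T$ to the nearest integer to $1/P_{\min}$. It also builds the worst-case memory described above (one isolated pattern plus all patterns at distance at least $n - x$ from it) and the full memory of all $2^n$ patterns.

```python
import math
from itertools import product

def hamming(a, b):
    """Number of positions in which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def recognition_probability(inp, patterns):
    """Probability of measuring the control qbit in |0>."""
    n = len(inp)
    return sum(math.cos(math.pi * hamming(inp, q) / (2 * n)) ** 2
               for q in patterns) / len(patterns)

def choose_threshold(patterns):
    """Nearest integer to 1/P_min over all stored patterns as inputs."""
    p_min = min(recognition_probability(q, patterns) for q in patterns)
    return round(1 / p_min)

def worst_case_memory(n, x):
    """Isolated all-zero pattern plus all patterns at distance >= n - x."""
    isolated = (0,) * n
    cluster = [q for q in product((0, 1), repeat=n)
               if hamming(q, isolated) >= n - x]
    return [isolated] + cluster

def full_memory(n):
    """All 2^n binary patterns of length n."""
    return list(product((0, 1), repeat=n))
```

For `worst_case_memory(6, 1)` the memory holds 8 patterns, the isolated pattern is recognized with probability of about 0.175, and `choose_threshold` returns 6. For `full_memory(4)` every input is recognized with probability 1/2, so the threshold is 2.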

While the recognition efficiency depends on comparing (squared) cosines and sines of the same distances in the distribution, the identification efficiency of eq. (19) depends on comparing the (squared) cosines of the different distances in the distribution. Specifically, it is best when one of the distances is zero, while all others are as large as possible, such that the probability of retrieval is completely peaked on one pattern. As a consequence, the identification efficiency is best when the recognition efficiency is worst and vice versa.

Having described at length the information retrieval mechanism for complete, but possibly corrupted patterns, it is easy to incorporate also incomplete ones. To this end assume that only $q < n$ qbits of the input are known and let me denote these by the indices $\{k_1, \dots, k_q\}$. After assigning the remaining qbits randomly, there are two possibilities. One can just treat the resulting complete input as a noisy one and proceed as above or, better, one can limit the operator $(d_H)_m$ in the Hamiltonian (14) to

$$(d_H)_m = \sum_{i=1}^{q} \left( \frac{\sigma_3 + 1}{2} \right)_{m_{k_i}} , \qquad (21)$$

so that the Hamming distances to the stored patterns are computed on the basis of the known qbits only. After this the pattern recall process continues exactly as described above. This second possibility has the advantage that it does not introduce random noise in the similarity measure but it has the disadvantage that the operations of the memory have to be adjusted to the inputs.
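The difference between the two options for incomplete inputs can be seen in a short classical sketch. The function names and the 0-based index convention are assumptions of this example: `masked_hamming` counts mismatches on the known positions only, as in eq. (21), while `full_hamming` also counts the randomly assigned positions.

```python
def masked_hamming(inp, pattern, known):
    """Hamming distance restricted to the known positions (0-based)."""
    return sum(1 for j in known if inp[j] != pattern[j])

def full_hamming(inp, pattern):
    """Hamming distance over all positions, random fill included."""
    return masked_hamming(inp, pattern, range(len(inp)))
```

Suppose the stored pattern is `"10110"` and only the first three qbits of the input are known to be `101`. If the remaining two are filled in randomly as `01`, the completed input `"10101"` has full distance 2 from the pattern, while the masked distance on positions `(0, 1, 2)` is 0.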

This brings me to the last point, the feasibility of the described algorithms. In this context I would like to point out that, in addition to the standard NOT, H (Hadamard), XOR, 2XOR (Toffoli) and nXOR gates, I have introduced only the two-qbit gates $CS^i$ in eq. (2) and the unitary operator $\exp(i\pi\mathcal{H}/2n)$. It remains thus only to show that this latter can be realized by simple gates involving few qbits. To this end I introduce the single-qbit gate

$$U = \begin{pmatrix} e^{i\frac{\pi}{2n}} & 0 \\ 0 & 1 \end{pmatrix} , \qquad (22)$$

and the two-qbit controlled $U^{-2}$ gate

$$CU^{-2} = |0\rangle\langle 0| \otimes 1 + |1\rangle\langle 1| \otimes U^{-2} . \qquad (23)$$

It is then easy to check that $\exp(i\pi\mathcal{H}/2n)$ can be realized as follows:

$$e^{i\frac{\pi}{2n}\mathcal{H}} |\psi_1\rangle = \prod_{i=1}^{n} \left( CU^{-2} \right)_{c m_i} \prod_{j=1}^{n} U_{m_j} |\psi_1\rangle , \qquad (24)$$

where $c$ is the control qbit in the first series of gates. Essentially, this means that one implements first $\exp(i\pi d_H/2n)$ and then one corrects by implementing $\exp(-i\pi d_H/n)$ on that part of the quantum state for which the control qbit $|c\rangle$ is in state $|1\rangle$. This completes the proof of feasibility.
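Since all operators involved are diagonal in the computational basis, the decomposition of eq. (24) can be verified by comparing phases on basis states. The sketch below is a minimal check with illustrative names. It assumes that $U$ multiplies $|0\rangle$ by $e^{i\pi/2n}$ and leaves $|1\rangle$ unchanged, and that $\mathcal{H}$ counts the 0's in register $m$ with a sign set by the control qbit.

```python
import cmath
import math
from itertools import product

def target_phase(m_bits, c, n):
    """Phase of exp(i*pi*H/2n) on the basis state |m_bits; c>."""
    d = m_bits.count(0)           # number of 0's in register m
    sign = 1 if c == 0 else -1    # eigenvalue of sigma_3 on the control qbit
    return cmath.exp(1j * sign * math.pi * d / (2 * n))

def gate_phase(m_bits, c, n):
    """Phase produced by the U gates followed by the CU^{-2} gates."""
    u = cmath.exp(1j * math.pi / (2 * n))  # action of U on |0>
    phase = 1 + 0j
    for bit in m_bits:
        if bit == 0:
            phase *= u            # single-qbit gate U on m_j
            if c == 1:
                phase *= u ** -2  # controlled gate CU^{-2} on (c, m_j)
    return phase

def decomposition_holds(n, tol=1e-12):
    """Compare both phases on every basis state of m and c."""
    return all(abs(target_phase(m, c, n) - gate_phase(m, c, n)) < tol
               for m in product((0, 1), repeat=n) for c in (0, 1))
```

Running `decomposition_holds(4)` compares the two phases on all 32 basis states of the memory and control registers.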

It remains to point out that the information retrieval algorithm can be, in principle, generalized by substituting the Hamiltonian (14) with

$$\mathcal{H} = \left( f(d_H) \right)_m \otimes (\sigma_3)_c , \qquad (25)$$

where $f$ is any function satisfying $f(0) = 0$ and $f(n) = n$. Such a generalization would above all have an influence on the identification efficiency by changing the shape of the probability distribution on the memory, which can be made narrower around the input. One can also give different weights to different qbits by introducing a non-trivial metric. The only restriction on all these generalizations is, as always, the feasibility of the resulting unitary evolution.
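To illustrate how the choice of $f$ reshapes the identification distribution, the sketch below compares the linear case $f(d) = d$ with the hypothetical choice $f(d) = \sqrt{n d}$, which satisfies $f(0) = 0$ and $f(n) = n$. This particular $f$ is an assumption made for the example only, and its realization by few-qbit gates is not addressed here.

```python
import math

def identification_weights(distances, n, f=lambda d: d):
    """Normalized cos^2 weights for the given Hamming distances."""
    weights = [math.cos(math.pi * f(d) / (2 * n)) ** 2 for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

def sqrt_profile(n):
    """Hypothetical f(d) = sqrt(n*d), with f(0) = 0 and f(n) = n."""
    return lambda d: math.sqrt(n * d)
```

For distances `[0, 1, 2, 4]` and `n = 8`, the pattern at distance 0 is retrieved with probability of about 0.30 in the linear case and about 0.41 with the square-root profile, so the distribution is more sharply peaked on the closest pattern.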